Seeing the Forest Through the Trees: Learning a Comprehensible Model from an Ensemble
نویسندگان
چکیده
Ensemble methods are popular learning methods that usually increase the predictive accuracy of a classifier though at the cost of interpretability and insight in the decision process. In this paper we aim to overcome this issue of comprehensibility by learning a single decision tree that approximates an ensemble of decision trees. The new model is obtained by exploiting the class distributions predicted by the ensemble. These are employed to compute heuristics for deciding which tests are to be used in the new tree. As such we acquire a model that is able to give insight in the decision process, while being more accurate than the single model directly learned on the data. The proposed method is experimentally evaluated on a large number of UCI data sets, and compared to an existing approach that makes use of artificially generated data.
منابع مشابه
Seeing the Forest through the Trees Learning a Comprehensible Model from a First Order Ensemble
Ensemble methods are popular learning methods that are usually able to increase the predictive accuracy of a classifier. On the other hand, this comes at the cost of interpretability, and insight in the decision process of an ensemble is hard to obtain. This is a major reason why ensemble methods have not been extensively used in the setting of inductive logic programming. In this paper we aim ...
متن کاملApplication of ensemble learning techniques to model the atmospheric concentration of SO2
In view of pollution prediction modeling, the study adopts homogenous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of Sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...
متن کاملEnsemble Classification and Extended Feature Selection for Credit Card Fraud Detection
Due to the rise of technology, the possibility of fraud in different areas such as banking has been increased. Credit card fraud is a crucial problem in banking and its danger is over increasing. This paper proposes an advanced data mining method, considering both feature selection and decision cost for accuracy enhancement of credit card fraud detection. After selecting the best and most effec...
متن کاملRule Extraction from Ensemble Methods Using Aggregated Decision Trees
Ensemble methods have become very well known for being powerful pattern recognition algorithms capable of achieving high accuracy. However, Ensemble methods produces learners that are not comprehensible or transferable thus making them unsuitable for tasks that require a rational justification for making a decision. Rule Extraction methods can resolve this limitation by extracting comprehensibl...
متن کاملEnsemble of M5 Model Tree Based Modelling of Sodium Adsorption Ratio
This work reports the results of four ensemble approaches with the M5 model tree as the base regression model to anticipate Sodium Adsorption Ratio (SAR). Ensemble methods that combine the output of multiple regression models have been found to be more accurate than any of the individual models making up the ensemble. In this study additive boosting, bagging, rotation forest and random subspace...
متن کامل